Visualizing Statistical Mix Effects and Simpson's Paradox
Abstract
We discuss how "mix effects" can surprise users of visualizations and potentially lead them to incorrect conclusions. This statistical issue (also known as "omitted variable bias" or, in extreme cases, as "Simpson's paradox") is widespread and can affect any visualization in which the quantity of interest is an aggregated value such as a weighted sum or average. Our first contribution is to document how mix effects can be a serious issue for visualizations, and we analyze how mix effects can cause problems in a variety of popular visualization techniques, from bar charts to treemaps. Our second contribution is a new technique, the "comet chart," that is meant to ameliorate some of these issues.
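A minimal sketch of how a mix effect can arise in a weighted average. All numbers and names below (segments "A" and "B", the rates, the segment sizes) are hypothetical and chosen only for illustration; they do not come from the paper. Every segment's rate rises between the two periods, yet the aggregate falls because the mix of segment sizes shifts toward the lower-rate segment.

```python
def weighted_average(rates, sizes):
    """Aggregate rate: each segment's rate weighted by its size."""
    total = sum(sizes.values())
    return sum(rates[s] * sizes[s] for s in rates) / total

# Hypothetical per-segment rates in two periods (illustrative values only).
rates_before = {"A": 0.10, "B": 0.50}
rates_after = {"A": 0.12, "B": 0.55}

# Hypothetical segment sizes: the mix shifts from mostly B to mostly A.
sizes_before = {"A": 100, "B": 900}
sizes_after = {"A": 900, "B": 100}

overall_before = weighted_average(rates_before, sizes_before)
overall_after = weighted_average(rates_after, sizes_after)

# Each segment improves, but the aggregate moves the other way.
for segment in rates_before:
    print(segment, rates_before[segment], "->", rates_after[segment])
print("overall", round(overall_before, 3), "->", round(overall_after, 3))
```

A chart that plots only the two aggregate values would suggest a decline, even though no individual segment declined. This is the extreme case the abstract refers to as Simpson's paradox.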
Related Resources
Simpson's Paradox and Cornfield’s Conditions
Simpson's Paradox occurs when an observed association is spurious – reversed after taking into account a confounding factor. At best, Simpson's Paradox is used to argue that association is not causation. At worst, Simpson's Paradox is used to argue that induction is impossible in observational studies (that all arguments from association to causation are equally suspect) since any association c...
Simpson's Paradox, Lord's Paradox, and Suppression Effects are the same phenomenon – the reversal paradox
This article discusses three statistical paradoxes that pervade epidemiological research: Simpson's paradox, Lord's paradox, and suppression. These paradoxes have important implications for the interpretation of evidence from observational studies. This article uses hypothetical scenarios to illustrate how the three paradoxes are different manifestations of one phenomenon--the reversal paradox-...
Revisiting Simpson’s Paradox: statistically warranted vs. unwarranted inference results
The primary objective of this paper is to revisit Simpson’s paradox using a statistical misspecification perspective. It is argued that the reversal of statistical associations is sometimes spurious, stemming from invalid probabilistic assumptions imposed on the data. The concept of statistical misspecification is used to formalize the vague term ‘spurious results’ as ‘statistically untrustwort...
The Inverse Simpson Paradox (How to win without overtly cheating)
Given two sets of data which lead to a similar statistical conclusion, the Simpson Paradox [10] describes the tactic of combining these two sets and achieving the opposite conclusion. Depending upon the given data, this may or may not succeed. Inverse Simpson is a method of decomposing a given set of comparison data into two disjoint sets and achieving the opposite conclusion for each one. This...
How Likely is Simpson's Paradox in Path Models?
Simpson’s paradox is a phenomenon arising from multivariate statistical analyses that often leads to paradoxical conclusions, in the field of e-collaboration as well as many other fields where multivariate methods are employed. We derive a general inequality for the occurrence of Simpson’s paradox in path models with or without latent variables. The inequality is then used to estimate the proba...
Journal:
- IEEE Transactions on Visualization and Computer Graphics
Volume 20, Issue 12
Publication date: 2014